DETAILED ACTION

1. This is a Non-Final rejection in response to the RCE filed on 03-29-2007.

2. Claims 1, 11-22, 24, 34-35, 45 and 47-52 are pending. Claims 2-10, 23, 25-33, and 46
are cancelled. Claims 12-21, and 36-44 are withdrawn from consideration (Non-elected
claims). Claims 47-52 are newly added.

3. Effective filing date is 11-03-2003 (Assignee: Adobe).



Continued Examination Under 37 CFR 1.114
4. A request for continued examination under 37 CFR 1.114, including the fee set forth in
37 CFR 1.17(e), was filed in this application after final rejection. Since this application is
eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e)
has been timely paid, the finality of the previous Office action has been withdrawn pursuant to
37 CFR 1.114. Applicant's submission filed on 03-29-2007 has been entered.

Claim Rejections - 35 USC § 103
5. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth 
in section 102 of this title, if the differences between the subject matter sought to be patented and the prior 
art are such that the subject matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be 
negatived by the manner in which the invention was made. 

Claims 1, 11-12, 22, 24, 34-35, 45, and 47-52 are rejected under 35 U.S.C. 103(a) as
being unpatentable over Gupta et al. US 20070011206A1 Con of 09/396,702 filed 09-14-1999
(hereinafter Gupta), in view of Dodrill et al. US006766298B1 filed 01-11-2000
(hereinafter Dodrill).

Regarding independent claim 1, Gupta teaches: 

A computer-implemented method for generating an audio-based form 
including one or more data fields, the method comprising: defining zoning 
information identifying a temporal location and temporal dimensions of the 
one or more data fields of the audio-based form. 
(See Gupta para 6, disclosing that a multimedia presentation includes "annotations" relating to the
multimedia presentation. An annotation is data (e.g., audio, text, video, etc.) that corresponds to a
multimedia presentation. These annotations typically correspond to a particular temporal location
in the multimedia presentation.

Also, see Gupta para 43, teaching that Annotation Back End (ABE) 132 of annotation server
10 also manages the interactive generation and presentation of streaming media data from server
computer 11 of FIG. 1 using "playlists" (i.e., a "playlist" is a listing of one or more multimedia
segments to be retrieved and presented in a given order). Each of the multimedia segments in the
playlist is defined by a source identifier, a start time, and an end time. The source identifier
identifies which media stream segment is part of the temporal location within the media stream.

Also, see Gupta fig. 4 and para 45, disclosing annotation entry fields, items 182-204.

Also, see Gupta fig. 7 and para 70, teaching a thumb 265 that moves within time strip
264 and indicates a particular temporal position within the media stream.)



[Figure: excerpt of Gupta FIG. 7 (reference numeral 260)]



Using the broadest reasonable interpretation, the examiner equates the claimed audio-based
form with Gupta's audio multimedia presentation, which includes a "playlist", i.e., a listing of one
or more multimedia segments to be retrieved and presented in a given order, as taught by Gupta.
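
For illustration only, the playlist structure cited from Gupta para 43 (segments defined by a source identifier, a start time, and an end time, presented in a given order) might be modeled as in the following minimal Python sketch. The class name PlaylistSegment, the field names, the use of seconds as the time unit, and the sample values are assumptions made for this example and do not appear in Gupta.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaylistSegment:
    """One playlist entry: a source identifier plus a start and end time."""
    source_id: str     # hypothetical identifier of the media stream
    start_time: float  # assumed unit: seconds from the start of the stream
    end_time: float    # assumed unit: seconds from the start of the stream

    def duration(self) -> float:
        return self.end_time - self.start_time


def ordered_segments(playlist):
    """Yield segments in list order, rejecting empty or reversed time spans."""
    for segment in playlist:
        if segment.end_time <= segment.start_time:
            raise ValueError(f"invalid time span in {segment!r}")
        yield segment


# Hypothetical two-segment playlist.
playlist = [
    PlaylistSegment("stream-a", 10.0, 25.0),
    PlaylistSegment("stream-b", 0.0, 5.5),
]
total_seconds = sum(s.duration() for s in ordered_segments(playlist))
```

In this sketch the temporal location of a segment is its start time and its temporal dimension is the difference between end time and start time.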

In addition, Gupta does not explicitly teach, but Dodrill teaches:

defining structural information including a name for each of the one 
or more data fields and a description of a type of user data expected to be 
provided for each of the one or more data fields; encoding the zoning and 
structural information in one or more audio signals; and incorporating the 
one or more audio signals including the encoded zoning and structural 
information into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include 
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#)
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 
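
As a hedged illustration of the media control information described above, the following Python sketch reads a prompt list and a record tag out of an in-line XML portion. The XML text is a cleaned-up, well-formed stand-in written for this example (the self-closing RECORD element in particular is an assumption), and the function name read_media_control is hypothetical; nothing here is taken from Dodrill's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the in-line XML portion 202.
INLINE_XML = """
<XML>
  <PROMPTLIST>
    <PROMPT type="wav" name="wavurl1"/>
    <PROMPT type="wav" name="wavurl2"/>
  </PROMPTLIST>
  <RECORD upload="upload URL" filename="local filename to use"/>
</XML>
"""


def read_media_control(xml_text):
    """Return (prompt names in document order, upload target or None)."""
    root = ET.fromstring(xml_text.strip())
    prompts = [prompt.get("name") for prompt in root.iter("PROMPT")]
    record = root.find("RECORD")
    upload_target = record.get("upload") if record is not None else None
    return prompts, upload_target


prompts, upload_target = read_media_control(INLINE_XML)
```

The prompt names come back in the order they are listed, which corresponds to the "prescribed sequence" in which the audio files are played.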



[FIG. 7 of Dodrill: web page 190, with embed tag 200, in-line XML portion 202,
prompt list 204, record tag 206, and HTML form 208]

<HTML>
<HEAD>
<TITLE> Hello </TITLE>
<BODY>
<FORM>
(Form contents)
</FORM>
<EMBED file="http://server/wavdirectory/wavfile.wav"
autostart = true>
<XML>
<PROMPTLIST>
<PROMPT type = "wav" name = "wavurl1"/>
<PROMPT type = "wav" name = "wavurl2"/>
</PROMPTLIST>
<RECORD upload = "upload URL"
filename = "local filename to use"
max length = "maximum record length">
</XML>
</HTML>
</body>
</html>



Using the broadest reasonable interpretation, the examiner equates the claimed zoning and
structural information in one or more audio signals with Dodrill's XML aware audio
resource 86 playing a "Good Morning" prompt for wavurl1 and an "Enter Your Phone Number
followed by the Pound (#) Key" prompt for wavurl2, while waiting for an input pattern
([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) specifies to the XML aware audio resource 86
that a valid input is composed of any string of the characters 0 through 9 for a length of 7 to 9
digits, followed by a pound key. The XML aware audio resource 86 continues to play the audio
files in the prescribed sequence while waiting for the user to input a key entry, as taught by
Dodrill.
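
The input rule quoted above (7 to 9 digits from 0 through 9, followed by a pound key) can be checked mechanically. The Python sketch below is an illustration under one assumption: the quoted notation ([0-9][7,9]#) is read as the standard regular expression [0-9]{7,9}#. That translation, and the function name is_valid_key_entry, are choices made for this example and are not stated in Dodrill.

```python
import re

# Assumed standard-regex reading of the quoted pattern "([0-9][7,9]#)":
# between 7 and 9 digits, then a single pound key.
VALID_INPUT = re.compile(r"[0-9]{7,9}#")


def is_valid_key_entry(keys: str) -> bool:
    """Return True only if the whole key sequence matches the assumed rule."""
    return VALID_INPUT.fullmatch(keys) is not None
```

For example, is_valid_key_entry("5551234#") is True, while a six-digit entry or an entry without the trailing pound key is rejected.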

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of defining structural
information including a name for each of the one or more data fields and a description of a type
of user data expected to be provided for each of the one or more data fields; encoding the
zoning and structural information in one or more audio signals; and incorporating the one or
more audio signals including the encoded zoning and structural information into the
audio-based form, as taught by Dodrill. One of ordinary skill in the art would have been
motivated to make this modification because the references are from the same field of endeavor,
namely XML voice-enabled and synchronized media that share a common timeline, and because
Dodrill's unified web-based voice messaging system provides voice application control between
a web browser and an application server via a hypertext transport protocol (HTTP) connection
on an Internet Protocol (IP) network. The web browser receives an HTML page from the
application server having an XML element that defines data for an audio operation to be
performed by an executable audio resource (see Dodrill at the abstract).

Regarding independent claim 12, the rejection of claim 1 is fully incorporated and is similarly
rejected along the same rationale. 
In addition, Gupta teaches: 

generating a form definition defining the audio-based form, 
(See Gupta para 43, teaching that Annotation Back End (ABE) 132 of annotation server 10 also
manages the interactive generation and presentation of streaming media data from server
computer 11 of FIG. 1 using "playlists" (i.e., a "playlist" is a listing of one or more multimedia
segments to be retrieved and presented in a given order). Each of the multimedia segments in the
playlist is defined by a source identifier, a start time, and an end time. The source identifier
identifies which media stream segment is part of the temporal location within the media stream.)
Using the broadest reasonable interpretation, the examiner equates the claimed audio-based
form with Gupta's audio multimedia presentation, which includes a "playlist", i.e., a listing of one
or more multimedia segments to be retrieved and presented in a given order, as taught by Gupta.

wherein audio data entered into the audio-based form by a user can be 
extracted from the audio-based form based on the encoded zoning and 
structural information without access to a source of zoning or structural 
information external to the audio-based form. 
(See Gupta para 43, teaching that Annotation Back End (ABE) 132 of annotation server 10 also
manages the interactive generation and presentation of streaming media data from server
computer 11 of FIG. 1 using "playlists" (i.e., a "playlist" is a listing of one or more multimedia
segments to be retrieved and presented in a given order). Each of the multimedia segments in the
playlist is defined by a source identifier, a start time, and an end time. The source identifier
identifies which media stream segment is part of the temporal location within the media stream.)
Using the broadest reasonable interpretation, the examiner equates the claimed audio-based
form with Gupta's audio multimedia presentation, which includes a "playlist", i.e., a listing of one
or more multimedia segments to be retrieved and presented in a given order, as taught by Gupta.
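
To make the notion of extracting data by temporal location and temporal dimension concrete, the following Python sketch slices a sequence of audio samples using per-field start times and lengths. It is an illustration only and is neither the applicant's claimed method nor Gupta's system; the Zone class, its field names, the sample rate, and the integer stand-in samples are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical zoning record for one data field."""
    field_name: str
    start_s: float   # temporal location, in seconds (assumed unit)
    length_s: float  # temporal dimension, in seconds (assumed unit)


def extract_fields(samples, sample_rate, zones):
    """Map each field name to the samples that fall inside its zone."""
    extracted = {}
    for zone in zones:
        first = int(round(zone.start_s * sample_rate))
        last = int(round((zone.start_s + zone.length_s) * sample_rate))
        extracted[zone.field_name] = samples[first:last]
    return extracted


# Ten seconds of stand-in "audio" at 8 samples per second.
samples = list(range(80))
zones = [Zone("name", 2.0, 3.0), Zone("phone", 6.0, 2.5)]
fields = extract_fields(samples, 8, zones)
```

Because each zone carries its own start and length, the slicing needs no information beyond the zone records and the samples themselves.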

Regarding independent claim 22, the rejection of claim 12 is fully incorporated and is similarly 
rejected along the same rationale. 

Independent claim 24 is directed toward a computer program product performing
the method of claim 12 and is similarly rejected under the same rationale.

Independent claims 35 and 45 are directed toward a computer program performing
the method of claim 12 and are similarly rejected under the same rationale.

Regarding claim 11, the rejection of claim 12 is fully incorporated and is similarly rejected
along the same rationale.

Claim 34 is directed toward a computer program performing the method of claim 11
and is similarly rejected under the same rationale.

Regarding claim 47, Gupta does not explicitly teach, but Dodrill teaches:

encoding instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded 
instructions into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#)
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 
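
The two-step posting described above (the recorded audio file to the "upload URL" of the record tag, then the user input to a URL specified within the HTML form) can be sketched as two HTTP POST requests. The Python example below only constructs the requests and does not send them. The URLs, the content types, the form field name, and the function name build_posts are assumptions for illustration and are not taken from Dodrill.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_posts(upload_url, form_url, audio_bytes, form_fields):
    """Build (but do not send) the audio upload and the form submission."""
    audio_post = Request(
        upload_url,
        data=audio_bytes,
        headers={"Content-Type": "audio/wav"},  # assumed content type
        method="POST",
    )
    form_post = Request(
        form_url,
        data=urlencode(form_fields).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    return audio_post, form_post


# Hypothetical endpoints and payloads.
audio_post, form_post = build_posts(
    "http://example.invalid/upload",
    "http://example.invalid/form",
    b"RIFF",
    {"phone": "5551234"},
)
```

Keeping the two requests separate mirrors the description, in which the destination of the recording and the destination of the keyed input are specified in different places.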

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

Regarding claim 48, Gupta does not explicitly teach, but Dodrill teaches: 

encoding instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded
instructions into the audio-based form. 

(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) 
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in
the HTML form 208) to another URL specified within the HTML form 208). 

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

Regarding claim 49, Gupta does not explicitly teach, but Dodrill teaches: 

encoding instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded 
instructions into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) 
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

Regarding claim 50, Gupta does not explicitly teach, but Dodrill teaches: 

encoding instructions indicating where and how to transmit user data
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded 
instructions into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) 
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

Regarding claim 51, Gupta does not explicitly teach, but Dodrill teaches:

encoding instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded 
instructions into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include 
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a 
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the 
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) 
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

Regarding claim 52, Gupta does not explicitly teach, but Dodrill teaches: 

encoding instructions indicating where and how to transmit user data 
extracted from the audio-based form into one or more audio signals; and 
incorporating the one or more audio signals including the encoded 
instructions into the audio-based form. 
(See Dodrill col 11, lines 15-40, teaching XML pages define logical operations, where the 
parameters and attributes may be set and compared and used by the application logic. The XML 
document may specify logic in the form of menu structures, equivalent to if/then/else statements.

Also, see Dodrill col. 11 lines 54-60, teaching decision XML documents also include
activity tags but do not rely on user input; rather, the decision XML document includes options 
tags that specify the actions to be taken based on the respective values returned by the procedure 
call specified by the activity tag. 

Also, see Dodrill fig. 7 and col. 11 line 65 through col. 8 line 30, illustrating the web
page 190 includes a standard embed tag 200 in HTML format, and an in line XML portion 202 
that includes media control information, such as a prompt list 204 (i.e. For example, the prompt 
list 204 specifies an audio file "wavfile.wav" to be played by the browser, for example as a
welcome greeting. If the plug-in resource 86 in the browser is XML control aware, then the
XML aware audio resource 86 begins to play the audio files "wavurl1" and "wavurl2" in the
prescribed sequence. For example, the XML aware audio resource 86 plays a "Good Morning"
prompt for wavurl1 and "Enter Your Phone Number followed by the Pound (#) Key" prompt for
wavurl2, while waiting for an input pattern ([0-9][7,9]#). This exemplary pattern ([0-9][7,9]#) 
specifies to the XML aware audio resource 86 that a valid input is composed of any string of the 
characters 0 through 9 for a length of 7 to 9 digits, followed by a pound key. The XML aware 
audio resource 86 continues to play the audio files in the prescribed sequence while waiting for 
the user to input a key entry). The browser can then "quietly" post the recorded audio file to the 
"upload URL" specified in the record tag 206, and then post the user input (e.g., as specified in 
the HTML form 208) to another URL specified within the HTML form 208). 

It would have been obvious to a person of ordinary skill in the art at the time the
invention was made to have modified Gupta's teaching that an audio media stream segment is
part of the temporal location within the media stream, to include a means of encoding
instructions indicating where and how to transmit user data extracted from the audio-based form
into one or more audio signals; and incorporating the one or more audio signals including the
encoded instructions into the audio-based form, as taught by Dodrill. One of ordinary skill in the
art would have been motivated to make this modification because the references are from the
same field of endeavor, namely XML voice-enabled and synchronized media that share a common
timeline, and because Dodrill's unified web-based voice messaging system provides voice
application control between a web browser and an application server via a hypertext transport
protocol (HTTP) connection on an Internet Protocol (IP) network. The web browser receives an
HTML page from the application server having an XML element that defines data for an audio
operation to be performed by an executable audio resource (see Dodrill at the abstract).

6. It is noted that any citations to specific pages, columns, lines, or figures in the prior art
references and any interpretation of the references should not be considered to be limiting in any
way. A reference is relevant for all it contains and may be relied upon for all that it would have
reasonably suggested to one having ordinary skill in the art. See MPEP 2123.

Response to Arguments

7. Applicant's arguments with respect to claims 1, 11-12, 22, 34-35, and 45 have been
considered but are moot in view of the new ground(s) of rejection. This office action is a
Non-Final Rejection in order to give the applicant sufficient opportunity to respond to the new
line of rejection.

Conclusion 

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Quoc A. Tran whose telephone number is 571-272-8664. The 
examiner can normally be reached on 9AM - 5PM EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, 
Herndon R. Heather can be reached on 571-272-4136. The fax phone number for the 
organization where this application or proceeding is assigned is (571)-273-8300. 

Information regarding the status of an application may be obtained from the Patent
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 